Measuring Program Behavior Similarity

Author

  • David Lilja
Abstract

Designers of microarchitectures for general-purpose microprocessors once based their design decisions on experts' intuition and rules of thumb. Since the mid-1980s, however, microarchitecture research has become a systematic process that uses simulation tools extensively. Although architectural simulators model microarchitectures at a high abstraction level, the increasing complexity of both the microarchitectures and the applications that run on them makes these simulators very time-consuming. Simulators must execute huge numbers of instructions to create a workload representative of real applications. The Standard Performance Evaluation Corporation's (SPEC) CPU2000 benchmark suite, for example, has many more dynamic instructions than CPU95, which it replaced. Although real hardware evaluations benefit from this increase, running architectural simulators over such large numbers of instructions becomes infeasible. The dynamic instruction count of the SPEC2000 benchmark parser with reference input is about 500 billion instructions, or roughly three weeks of simulation at 300,000 instructions per second. Running every benchmark across a huge number of design points makes the total simulation time unreasonably long, stretching the time to market. Running the simulations in parallel results in a huge equipment cost. To solve this problem, we can use reduced input sets instead of reference input sets. The ideal reduced input set has a limited dynamic instruction count but produces program behavior comparable to the reference input set behavior. MinneSPEC collects a number of reduced input sets for some CPU2000 benchmarks. It proposes three reduced inputs: smred for short simulations, mdred for medium-length simulations, and lgred for full-length, reportable simulations.
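The simulation-time figure quoted for parser can be checked with a few lines of arithmetic. The sketch below uses only the two numbers given in the abstract (about 500 billion instructions at 300,000 simulated instructions per second); the function name `simulation_days` is an illustrative choice, not something from the paper.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def simulation_days(instructions: float, instructions_per_second: float) -> float:
    """Wall-clock days needed to simulate a given dynamic instruction count."""
    return instructions / instructions_per_second / SECONDS_PER_DAY


# Figures taken from the abstract: ~500 billion instructions, 300,000 instr/s.
days = simulation_days(500e9, 300_000)
print(f"{days:.1f} days, {days / 7:.1f} weeks")  # prints "19.3 days, 2.8 weeks"
```

The result, a little over 19 days for a single benchmark at a single design point, is what the abstract rounds to three weeks.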
Although a number of techniques—such as truncating or modifying the inputs—can derive these reduced input sets from the reference inputs, it is unclear whether these reduced input sets will produce behavior similar to a program using a reference input set. We have developed a methodology that reliably quantifies program behavior similarity. As such, we can validate MinneSPEC—that is, we can verify whether the reduced input sets result in program behavior similar to the reference inputs. To overcome the shortcomings of previous work, our methodology uses metrics that are closely related to performance. We also use statistical data analysis techniques to calculate the similarity in program behavior based on uncorrelated workload characteristics.
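The abstract does not name the statistical technique used to obtain uncorrelated workload characteristics. As a minimal sketch, assuming principal component analysis (one common way to decorrelate measurements), similarity between a reference input and a reduced input could be expressed as a distance in the space of the leading components. The characteristic values, the column meanings, and the function names below are all hypothetical and serve only to illustrate the idea.

```python
import numpy as np


def decorrelate(X: np.ndarray, n_components: int) -> np.ndarray:
    """Standardize each column, then project onto the leading principal components.

    Assumption: PCA stands in for the unnamed decorrelation step.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T


def distance(scores: np.ndarray, i: int, j: int) -> float:
    """Euclidean distance between two program-input pairs in the reduced space."""
    return float(np.linalg.norm(scores[i] - scores[j]))


# Hypothetical data: one row per input set of a single benchmark, one column
# per made-up, performance-related characteristic. These are not measurements.
labels = ["ref", "lgred", "mdred", "smred"]
X = np.array([
    [1.20, 0.030, 0.050],
    [1.18, 0.031, 0.052],
    [1.10, 0.040, 0.060],
    [0.80, 0.090, 0.020],
])

scores = decorrelate(X, n_components=2)
for k in range(1, len(labels)):
    print(f"{labels[k]} vs ref: {distance(scores, 0, k):.3f}")
```

A small distance to the reference row would suggest that the reduced input behaves similarly under these characteristics. Keeping only the leading components is what changes the distances: a rotation that retains every component leaves Euclidean distances unchanged.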

Similar resources

A program behavior recognition algorithm based on assembly instruction sequence similarity

The analysis of assembly instruction sequences plays a vital role in measuring software similarity, malware recognition, software analysis, and related fields. This paper summarizes the features of assembly instructions, builds a six-group model and puts forward an algorithm for calculating the similarity of assembly instructions. On that basis a set of methods of calculating similarity of assembly i...

Translation Invariant Approach for Measuring Similarity of Signals

In many signal processing applications, an appropriate measure to compare two signals plays a fundamental role in both implementing the algorithm and evaluating its performance. Several techniques have been introduced in literature as similarity measures. However, the existing measures are often either impractical for some applications or they have unsatisfactory results in some other applicati...

Measuring the Similarity of Trajectories Using Fuzzy Theory

In recent years, with the advancement of positioning systems, access to a large amount of movement data is provided. Among the methods of discovering knowledge from this type of data is to measure the similarity of trajectories resulting from the movement of objects. Similarity measurement has also been used in other data mining methods such as classification and clustering and is currently, an...

Measuring the Structural Similarity of Web-based Documents: A Novel Approach

Most known methods for measuring the structural similarity of documents are based on, e.g., tag measures, path metrics and tree measures in terms of their DOM trees. Other methods measure the similarity in the framework of the well-known vector space model. In contrast to these we present a new approach to measuring the structural similarity of web-based documents represented by so c...

Publication date: 2001